Abstract: Osteogenesis imperfecta (OI) is a genetic disease characterized by fragile bones,
skeletal  deformities  and,  in  severe  cases,  prenatal  death  that  affects  more  than  1  in  10,000 
individuals.  Here  we  show  by  full  atomistic  simulation  in  explicit  solvent  that  OI  mutations  have  a 
significant  influence  on  the  mechanical  properties  of  single  tropocollagen  molecules,  and  that  the 
severity  of  different  forms  of  OI  is  directly  correlated  with  the  reduction  of  the  mechanical  stiffness 
of  individual  tropocollagen  molecules.  The  reduction  of  molecular  stiffness  provides  insight  into 
the  molecular-scale  mechanisms  of  the  disease.  The  analysis  of  the  molecular  mechanisms  reveals 
that  physical  parameters  of  side-chain  volume  and  hydropathy  index  of  the  mutated  residue 
control  the  loss  of  mechanical  stiffness  of  individual  tropocollagen  molecules.  We  propose  a 
model  that  enables  us  to  predict  the  loss  of  stiffness  based  on  these  physical  characteristics  of 
mutations.  This  finding  provides  an  atomistic-level  mechanistic  understanding  of  the  role  of  OI 
mutations  in  defining  the  properties  of  the  basic  protein  constituents,  which  could  eventually  lead 
to new strategies for diagnosis and treatment of the disease. The focus on material properties and
their role in genetic diseases is an important, yet so far little explored, aspect in studying the
mechanisms  that  lead  to  pathological  conditions.  The  consideration  of  how  material  properties 
change  in  diseases  could  lead  to  a  new  paradigm  that  may  expand  beyond  the  focus  on 
biochemical  readings  alone  and  include  a  characterization  of  material  properties  in  diagnosis  and 
treatment,  an  effort  referred  to  as  materiomics. 
Introduction

Patients affected by osteogenesis imperfecta (OI) exhibit an array of associated symptoms,
including short stature, loose joints, blue sclerae, dentinogenesis imperfecta, hearing loss,
and neurological and pulmonary complications. The classification of OI is commonly based on
clinical features, which define four groups ranging from mild (OI type I) to severe (OI types
III and IV) to perinatal lethal (OI type II). The genetic basis for about 90% of the forms of
this disease lies in mutations of type I collagen genes, as tabulated in the database of human
collagen mutations (URL: http://www.le.ac.uk/genetics/collagen). Missense mutations that alter
a glycine (Gly) codon in the genes encoding the characteristic collagen triple helix are the
most common causes of OI. The replacement of either guanine (G) residue in the glycine codon
(GGC) can theoretically result in substitution by eight different amino acids: serine (Ser),
cysteine (Cys), alanine (Ala), valine (Val), aspartic acid (Asp), glutamic acid (Glu), arginine
(Arg), and tryptophan (Trp). All possibilities have been described in conjunction with OI,
although the frequency with which the different mutations occur varies considerably, with
tryptophan replacements being exceedingly rare. Although a general correspondence between
specific mutations and the severity of OI has been reported, the molecular mechanisms by which
a single point mutation can alter the susceptibility of an entire bone to brittle fracture are
still unknown.

Figure 1. Tropocollagen molecular model and loading conditions. Panel (a): Molecular geometry
of the tropocollagen molecule, indicating the residues that are replaced in the mutation. All
tropocollagen-like peptides considered in this study share the same structure, consisting of
three identical chains made of Gly-Pro-Hyp triplets: [(GPO)5-(XPO)-(GPO)4]3. The X position of
each chain (highlighted in red in the upper part) is one of seven replacing residues related to
OI (lower part). Panel (b) depicts the loading condition, subjecting the tropocollagen molecule
to tensile deformation. The N-terminus of the molecule is kept fixed with a strong position
restraint, while the C-terminus is linked to a moving spring. [Color figure can be viewed in
the online issue, which is available at www.interscience.wiley.com.]

Many studies have attempted to correlate glycine mutation types and locations with phenotypic
severity. Some trends are apparent, such as OI severity increasing from the amino toward the
carboxyl terminus and with substitution by large and charged amino acids. At present, however,
genotype-phenotype correlations are too weak to accurately predict the phenotypic effect of a
particular glycine mutation. Indeed, how a single point mutation in the tropocollagen molecule
can cause the failure of the entire skeletal system remains a mystery. In particular, it
remains unclear at what level in the tissue structure the mutations influence the behavior. To
address these points, investigations at all hierarchical levels must be carried out, beginning
at the molecular scale. This should begin with an investigation of the effects at the level of
single molecules, followed by studies of the effects at the microfibrillar level (interaction
of different peptides, effects on mineral crystal growth and distribution), eventually
incorporating fibrillar and larger-scale levels (investigation of composite materials featuring
protein and mineral components).

Here we report a series of systematic molecular-scale experiments, carried out using a
molecular dynamics simulation approach that provides us with the ability to probe molecular
mechanics at physiologically relevant ultraslow loading rates (for details of the computational
experiments, see the Methods section, as well as Ref. 10). We consider seven collagen-like
peptides with a glycine mutation as shown in Figure 1(a). The goal of the study reported here
is to investigate the effect of these OI mutations on the mechanical properties of a single
tropocollagen molecule under tensile stretch [Fig. 1(b)], a physiologically relevant mechanical
loading condition. Individual tropocollagen molecules are subjected to tensile stretch when
load is applied at larger hierarchical length scales in tissues, such as in collagen fibrils or
fibers.

This illustrates the significance of considering the tensile elastic properties of
tropocollagen molecules to provide a fundamental description of the effect of OI on the
material properties. The mechanical stiffness of individual molecules is measured by the
Young’s modulus, which is defined as the proportionality constant between stress and strain.
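
Written out, the proportionality just described is Hooke’s law for uniaxial tension. The
symbols F (axial force), A (cross-sectional area), L_0 (initial length), and \Delta L
(elongation) are introduced here for illustration only; they are assumptions of this sketch
and not notation taken from this section:

```latex
\sigma = E\,\varepsilon, \qquad \sigma = \frac{F}{A}, \qquad \varepsilon = \frac{\Delta L}{L_0}
```

Under this relation, a molecule whose Young’s modulus E is 15% lower carries 15% less stress
at the same strain.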

Results  and  Discussion 

The Young’s modulus of the reference tropocollagen molecule is determined to be 3.96 ± 0.21 GPa
(see Table I), in agreement with corresponding experimental results. However, molecules with a
glycine mutation are mechanically softer than the reference peptide, with values ranging from
3.37 ± 0.32 GPa to 3.78 ± 0.29 GPa for all replacements except valine (3.91 ± 0.24 GPa). A
decrease in Young’s modulus of up to 15% is observed, depending on the specific type of
replacing residue [see Table I and Fig. 2(a)]. Since the average values are close and the
standard deviations large, we performed an analysis of variance (ANOVA) to assess the
statistical significance of the results. At a significance level of α = 0.05, the critical F
value is 2.947 in this case. The test gives an F value of 3.500 for our data, which is higher
than the critical F value, and thus we can reject the null hypothesis (i.e., that there is no
effect of mutations on Young’s modulus). Equivalently, the P value associated with the null
hypothesis is 0.011.

Table I. Type of Glycine Mutation, Tropocollagen Young’s Modulus, and Severity of OI

Replacing residue      Young’s modulus (GPa)   Severity (%)
Glycine (reference)    3.96 ± 0.21             N/A
Alanine                3.78 ± 0.29             41.6 (5/12)^a
Arginine               3.69 ± 0.23             45.8 (22/48)
Cysteine               3.74 ± 0.11             47.8 (22/46)
Serine                 3.51 ± 0.10             65.0 (41/63)
Valine                 3.91 ± 0.24             65.2 (15/23)
Glutamic acid          3.38 ± 0.24             66.6 (6/9)
Aspartic acid          3.37 ± 0.32             87.5 (35/40)

All peptides featuring a glycine replacement show a slightly but statistically significantly
lower Young’s modulus, which is correlated with the severity of the OI forms caused by the
different mutations. The severity is defined here as the ratio between the severe occurrences
(leading to OI type II and type III) and the total occurrences of each considered glycine
replacement. A total of 241 mutations are considered in this analysis.

^a Values in parentheses indicate the ratio of severe/total occurrences.
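
The two derived quantities used throughout this section, the severity ratio and the relative
loss of stiffness, follow directly from the entries of Table I. The short sketch below
recomputes them; the dictionary layout and function names are illustrative choices made here,
not part of the original analysis.

```python
# Table I data: residue -> (mean Young's modulus in GPa, severe cases, total cases)
TABLE_I = {
    "Ala": (3.78, 5, 12),
    "Arg": (3.69, 22, 48),
    "Cys": (3.74, 22, 46),
    "Ser": (3.51, 41, 63),
    "Val": (3.91, 15, 23),
    "Glu": (3.38, 6, 9),
    "Asp": (3.37, 35, 40),
}
E_REFERENCE = 3.96  # GPa, reference (glycine) peptide from Table I


def severity_percent(severe, total):
    """Severity as defined in Table I: severe occurrences over total occurrences."""
    return 100.0 * severe / total


def stiffness_loss_percent(modulus, reference=E_REFERENCE):
    """Relative reduction of Young's modulus with respect to the reference peptide."""
    return 100.0 * (reference - modulus) / reference


def summarize(table=TABLE_I):
    """Return (residue, severity in %, stiffness loss in %) for every replacement."""
    return [
        (residue, severity_percent(severe, total), stiffness_loss_percent(modulus))
        for residue, (modulus, severe, total) in table.items()
    ]


if __name__ == "__main__":
    for residue, severity, loss in summarize():
        print(f"{residue}: severity {severity:5.1f}%  stiffness loss {loss:4.1f}%")
    print("total mutations:", sum(total for _, _, total in TABLE_I.values()))
```

The total of the denominators reproduces the 241 mutations quoted in the table note, and the
largest loss (aspartic acid) corresponds to the "up to 15%" figure quoted in the text.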

The observed changes in mechanical properties are local, because of the truncated model
considered in our study, and the mechanical properties of the overall molecule are therefore
probably unaffected. However, local changes as observed in this work could have a tremendous
impact (in ways yet to be established) on the onset of the disease, which is characterized by
catastrophic failure of bone. Catastrophic failure of materials typically begins with a very
small defect, which grows until failure occurs. This is a hallmark of brittle fracture, as is
well established in the materials science field (see, e.g., Refs. 16-18). In such models,
stress concentrations develop at small crack-like flaws, which represent regions of
structurally soft and weak material. The crack tip thereby provides a mathematical singularity
for stresses even at finite applied load, $\sigma \sim 1/\sqrt{r}$, where r is the distance
from a flaw and σ is the resulting stress. In this sense, one can understand the significance
of local effects in inducing catastrophic failure of a material due to a local defect, as they
magnify a small stress applied at the overall tissue level to a much greater value at a local
scale, where it can lead to permanent damage of tissue and grow into macroscopic failure.
Furthermore, these peptides feature only Gly-Pro-Hyp triplets around the mutations, and these
triplets are considered the most stabilizing. These triplets are also the most common, but
still represent only ≈10% of the total (in human collagen type I). Thus, mutations surrounded
by different (and in particular, less stabilizing) triplets could lead to much stronger
effects, and the results presented here may therefore be a conservative lower-bound estimate of
the effects.
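
The stress magnification near a crack-like flaw invoked above can be made explicit with the
standard near-tip result of linear elastic fracture mechanics. This is a textbook relation
rather than a result of this study; the stress intensity factor K, the far-field applied stress
\sigma_0, the flaw size a, and the dimensionless geometry factor Y_g are symbols introduced
here as assumptions for illustration:

```latex
\sigma(r) \approx \frac{K}{\sqrt{2\pi r}}, \qquad K = Y_g\,\sigma_0\sqrt{\pi a}
\quad\Longrightarrow\quad
\frac{\sigma(r)}{\sigma_0} \approx Y_g\sqrt{\frac{a}{2r}}
```

For example, at a distance r = a/200 from the tip the ratio a/(2r) equals 100, so the local
stress is roughly ten times Y_g times the applied stress, which illustrates how a small load at
the tissue level can become a large load locally.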

We find that mutations related to more severe phenotypes are associated with softer
tropocollagen mechanical properties, as illustrated in Figure 2(b). This is confirmed by a
linear curve fit of the Young’s modulus results against the severity of the disease for the
particular mutations. The severity parameter is calculated as the ratio between the severe
occurrences (leading to OI type II and type III) and the total occurrences of each considered
glycine replacement.
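
A minimal sketch of such a linear fit, using only the mean values listed in Table I, is given
below. It is an ordinary least-squares line through seven points and is not claimed to be the
exact procedure behind Figure 2(b); in particular, fitting the modulus relative to the glycine
reference and excluding glycine itself (for which no severity is defined) are assumptions made
here.

```python
from math import sqrt

# (severity in %, mean Young's modulus in GPa) for each replacement in Table I.
TABLE_I = {
    "Ala": (41.6, 3.78), "Arg": (45.8, 3.69), "Cys": (47.8, 3.74),
    "Ser": (65.0, 3.51), "Val": (65.2, 3.91), "Glu": (66.6, 3.38),
    "Asp": (87.5, 3.37),
}
E_REFERENCE = 3.96  # GPa, reference (glycine) peptide


def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept, plus Pearson's r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


def stiffness_vs_severity(table=TABLE_I, reference=E_REFERENCE):
    """Fit the relative Young's modulus (E / E_reference) against severity in %."""
    severities = [severity for severity, _ in table.values()]
    relative = [modulus / reference for _, modulus in table.values()]
    return linear_fit(severities, relative)


if __name__ == "__main__":
    slope, intercept, r = stiffness_vs_severity()
    print(f"slope = {slope:.5f} per % severity")
    print(f"intercept = {intercept:.3f}")
    print(f"Pearson r = {r:.2f}")
```

A negative slope corresponds to the trend stated in the text: the more severe the mutation, the
softer the molecule. Valine, which combines high severity with an almost unchanged modulus,
is the point that weakens the correlation most.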


Figure 2. Young’s modulus of the different peptides as a function of the glycine replacement
[panel (a)] and as a function of OI severity [panel (b)]. In panel (a), the mechanical
properties of each different tropocollagen molecule are depicted as a function of the replacing
amino acid residues, which are ordered based on the resulting disease severity, that is, from
the physiological glycine (left) to the most severe OI mutation, aspartic acid (right). Panel
(b) shows the Young’s modulus relative to that of the reference (glycine) as a function of the
mutation severity. The severity parameter, introduced for the first time in this work, is meant
to quantify the degree of OI severity due to specific mutations. It is generally accepted that
some mutations lead to a more severe phenotype. The severity parameter quantifies this trend
using all available data on OI mutations, that is, more than 200 occurrences catalogued in the
Database of Human Collagen Mutations (http://www.le.ac.uk/genetics/collagen).

Figure 3. Measured and predicted values of Young’s modulus. The plot compares Young’s modulus
as measured from atomistic simulation for each tropocollagen molecule (blue) with the
mechanical properties calculated using Eq. (1) (purple). The results from the first six
peptides (Gly through Val) are used to fit the parameters of Eq. (1), which is then used to
predict the Young’s modulus of the remaining two peptides (Glu and Asp).


Through this analysis, we obtained a numerical parameter based on a well-established
classification. We note that a correlation does not necessarily imply causality. Nonetheless,
it is known that OI mutations affect the mechanical properties of collagenous tissues, which
provides the basis for the hypothesis that an effect might also be observed at the molecular
level. The ANOVA, which indeed suggests an effect of mutations on the Young’s modulus,
supports the view that the observed correlation is likely due to causality.

We observe that the mechanical properties of collagen peptides are affected, to a varying
degree, as a consequence of glycine mutations. Thus, we hypothesize that some properties of the
replacing amino acids must affect the peptide mechanical properties, in this case the Young’s
modulus. The amino acids can in principle be characterized through a variety of physical
properties. Here, however, we focus on the hydropathy index (H) (a measure of how hydrophobic
or hydrophilic an amino acid is) and the residue volume (V). These two parameters have been
chosen because there is evidence in the literature that these physical properties are related
to the degree of severity of OI, as pointed out in earlier work. We find that the molecule’s
stiffness can be expressed as a function of these two factors. This is achieved with an
empirical relation with five parameters, in a simple model:

$$Y = aH^{2} + bV^{2} + cH + dV + e \qquad (1)$$

We fit the parameters a, b, c, d, and e in Eq. (1) based on the results for six peptides (Gly,
Ala, Ser, Cys, Arg, Val), for a total of 24 independent data points. This then enables us to
predict the Young’s modulus for the remaining two peptides (Glu and Asp), which were not
included in the initial fitting. Good agreement between the results of the simulations and the
values predicted using Eq. (1) is found (see Fig. 3). This suggests that the physical
parameters of side-chain volume and hydropathy index of the replacing residue control the loss
of mechanical stiffness, and thus our results corroborate the hypothesis that these two
parameters are important in determining the degree of severity of the disease (although future
studies could focus on other physical parameters).

Figure 4. Influence of the hydropathy index and the side-chain volume of replacing residues on
tropocollagen Young’s modulus. The most severe cases of OI fall into the dark blue domain, with
a hydropathy index of -3.75 and a side-chain volume of ≈100 Å³. The most severe mutations
feature the smallest distance from the center of the dark blue domain (distances of individual
cases to the center of the dark blue domain are indicated by white lines).

Figure 5. Ramachandran plot relative to the mutation positions and to the two adjacent amino
acids in each direction. In the reference case [panel (a)] the configuration is close to that
of the polyproline II chain [Psi = 150°; Phi = -75°; red circle in panel (a)]. The presence of
the mutations alters the coiling of the molecule, shown by the scattering in the Phi and Psi
values [panel (b)]. In the case of valine [panel (c)] this scattering is reduced, leading to a
coiling similar to the reference case.

Figure 4 shows a contour plot of Young’s modulus as predicted from the side-chain volume and
the hydropathy index of the replacing residue. This plot illustrates that the stiffness
decreases when glycine is replaced by larger residues. This observation is consistent with the
fact that every third residue in the collagen triple helix is tucked into the sterically
constricted internal space of the molecule. Glycine, being the smallest amino acid, is the only
one that can fit this position without destabilizing the triple helix (see the Ramachandran
analysis shown in Fig. 5). Figure 4 also illustrates that the mechanical properties of a single
molecule are reduced with mildly hydrophilic replacing residues. This could be explained by
taking into account that residues with a high hydropathy index fit well in the hydrophobic
protein core of tropocollagen; on the other hand, highly hydrophilic residues could partially
compensate for the destabilizing effect on the triple helix through a larger number of
water-mediated H bonds. Indeed, the valine mutation is the only one that does not follow the
general trend, being a mutation that usually leads to a severe phenotype while not reducing the
mechanical properties of the peptide. The reason is not obvious, but could be related to the
high hydropathy index: since the position usually occupied by glycine is tucked into the core
of the protein, a substitution by a highly hydrophobic residue such as valine would lead to
lower disruption of the triple helix. This is indeed observed in the molecular structure of the
peptides at the end of the equilibration stage. While the other mutations affect the coiling of
the molecule, the valine mutation does not (see Fig. 5).
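
The fit-then-predict procedure described for Eq. (1) (fit on six peptides, predict the
remaining two) can be sketched as a linear least-squares problem, since Eq. (1) is linear in
its five parameters. The sketch below is illustrative only. The hydropathy and volume values
are assumptions: they are standard literature descriptors (Kyte-Doolittle hydropathy index,
residue volumes in cubic angstroms) and may differ from the values used in this study, so the
printed predictions are not expected to reproduce Figure 3 exactly. The helper names are
hypothetical.

```python
# Sketch: least-squares fit of Eq. (1), Y = a*H^2 + b*V^2 + c*H + d*V + e.
# ASSUMPTION: the H and V values below are standard literature descriptors
# (Kyte-Doolittle hydropathy index; residue volume in cubic angstroms).
# They are not taken from this paper and may differ from the values it used.
DESCRIPTORS = {
    "Gly": (-0.4, 60.1), "Ala": (1.8, 88.6), "Ser": (-0.8, 89.0),
    "Cys": (2.5, 108.5), "Arg": (-4.5, 173.4), "Val": (4.2, 140.0),
    "Glu": (-3.5, 138.4), "Asp": (-3.5, 111.1),
}
# Mean Young's moduli in GPa, from Table I.
MODULUS = {
    "Gly": 3.96, "Ala": 3.78, "Ser": 3.51, "Cys": 3.74,
    "Arg": 3.69, "Val": 3.91, "Glu": 3.38, "Asp": 3.37,
}
TRAINING = ["Gly", "Ala", "Ser", "Cys", "Arg", "Val"]
HELD_OUT = ["Glu", "Asp"]


def features(h, v):
    """Terms of Eq. (1); V is rescaled by 100 to keep the system well conditioned."""
    s = v / 100.0
    return [h * h, s * s, h, s, 1.0]


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit(points):
    """Least-squares coefficients (for the rescaled features) from (H, V, Y) triples."""
    rows = [features(h, v) for h, v, _ in points]
    ys = [y for _, _, y in points]
    k = len(rows[0])
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    moment = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(normal, moment)


def predict(coefficients, h, v):
    """Evaluate the fitted model at hydropathy h and volume v (cubic angstroms)."""
    return sum(c * f for c, f in zip(coefficients, features(h, v)))


if __name__ == "__main__":
    training = [(*DESCRIPTORS[name], MODULUS[name]) for name in TRAINING]
    coefficients = fit(training)
    for name in HELD_OUT:
        h, v = DESCRIPTORS[name]
        print(f"{name}: simulated {MODULUS[name]:.2f} GPa, "
              f"predicted {predict(coefficients, h, v):.2f} GPa")
```

Fitting to the six mean values gives the same coefficients as fitting to all 24 individual data
points, provided every peptide contributes the same number of replicas, because the replicas of
one peptide share the same H and V. Rescaling V only changes the numerical values of b and d,
not the form of Eq. (1). With five parameters and six distinct peptides the fit has a single
residual degree of freedom, so predictions for unseen residues should be read with caution.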

The results of our investigation show that the mechanical properties of collagen peptides are
affected by glycine mutations, and that the degree of this reduction is related to the type of
mutation. The underlying physical reason seems to be related to the degree of helical
disruption, since glycine substitutions lead to local disruption of the triple helical
arrangement. This is not observed in the case of valine mutations, and indeed this is reflected
in the mechanical properties of the peptide, which are similar to those of the reference
peptide. However, despite the structural changes, no effect on the number of interchain H-bonds
is observed, which can explain the relatively slight reduction of mechanical properties for
mutated peptides.

It has been speculated earlier that the type of mutation, together with the position along the
triple helix, influences the phenotypic severity of OI. In particular, it has been observed
that large and charged residues generally lead to more severe OI types. Here we have shown that
the local mechanical properties of single tropocollagen molecules can be predicted based on the
physical properties of the replacing residue.

Conclusions 

The results reported here may contribute to a better understanding of the molecular mechanisms
underlying OI and could eventually enable us to establish a direct link from the scale of
genetic mutations to the macroscale phenomenon of brittle bones. The present study shows that
the loss of stiffness in tropocollagen molecules and the severity of OI are correlated. Our
findings suggest that the changes in the molecular structure due to the OI mutations change
properties already at the single-molecule level. This is in contrast to recent studies of
mutations related to muscle dystrophies, which have shown that the properties of individual
α-helical protein molecules are not affected, suggesting that the disease has its origin at
larger hierarchical levels. The mechanism involved, however, is still unclear. Understanding
how structural modifications lead to the changes in the mechanical properties observed in the
present work is critically important on the path to forming a more holistic picture of OI.
Extending this work to consider larger length scales, multiple hierarchical levels, or other
structural modifications is a complex task that will be the aim of future work. Predictions of
the folded structure of collagenous proteins, for example using approaches developed in Pande’s
lab, could be used in conjunction with mechanical analyses as reported here.

We note that the results reported here alone cannot fully explain the molecular mechanism of
the disease, and other effects must be taken into consideration, such as effects at larger
hierarchical scales. However, this is a first step, which shows that a single point mutation
(i.e., 1 residue out of 30, corresponding to 3.3% of the total length) can affect the molecular
mechanical properties by as much as 15%. In order to evaluate the importance of these results,
it is important to consider that OI itself is based on apparently negligible effects: a single
point mutation in a triple helix of more than 1000 residues (i.e., less than 0.1% of the total
residues) may lead to failure of the entire skeletal system. Thus, in order to understand the
disease mechanism of OI, we should not neglect local effects simply based on the hypothesis
that they cannot immediately explain the entire phenomenon. Rather, we should expand from these
findings to continue the investigation to include more structural length scales (fibrils,
fibers, mineralized fibrils, etc.). Along similar lines, as discussed earlier, it is well
established in the materials science field that local defects and flaws can lead to
catastrophic brittle failure, because localized stress magnifications, which greatly exceed the
stress applied at the global tissue scale, produce large interatomic and intermolecular forces
that generate local shear regions. This concept, together with the findings put forward in this
article, provides a possible avenue of future investigation.

VOL 18:161-168

Gautieri et al.

The present study has focused solely on molecular-scale phenomena. To understand the more
macroscopic, and therefore clinically relevant, aspects of OI, one must consider the effects of
the change of single-molecule properties on larger-scale tissue properties. This could be
addressed by using a multiscale scheme of collagen modeling as reported in earlier studies,
which could be based on the results reported here. A careful atomistic-scale molecular dynamics
study of the properties of the material’s basic constituents, as reported here, is crucial for
the advancement of these models, as it provides the fundamental input parameters for the
constitutive behavior of the material’s basic building blocks. The effect of locally softened
domains on the behavior of collagen fibrils could be explored.

After identifying the entire genetic code of several species, an outstanding grand challenge in
the life sciences is the understanding of the multiscale behavior of hierarchical protein
assemblies. The advancement of this field is crucial for studies of biological systems, disease
diagnosis, and treatment, as well as the design of novel biomaterials. The focus on material
properties and their role in diseases is an important, so far little explored, aspect in
studying the mechanisms that lead to pathological conditions. The consideration of how material
properties change in diseases could thereby lead to a new paradigm that may expand beyond the
focus on biochemical readings alone, an effort referred to as materiomics. This could
eventually be important for both disease diagnosis and treatment.

Methods 

Single molecule full atomistic models in explicit water

The mechanical properties of tropocollagen molecules are investigated using steered molecular
dynamics (SMD) simulations, submitting the tropocollagen molecule to traction along the
principal axis, that is, from the N-terminus to the C-terminus [see inlay in Fig. 1(b)].

We build tropocollagen molecule models with various sequences using the software THeBuScr
(Triple-Helical collagen Building Script). We choose the simplest model of tropocollagen, with
only Gly-Pro-Hyp (GPO) triplets on each of the three chains, as the reference system (Hyp and O
are, respectively, the three-letter code and single-letter code for the amino acid
hydroxyproline). The central triplet is used to introduce the Gly replacement. The peptide
structure is [(GPO)5-(XPO)-(GPO)4]3, where the X position is occupied by alanine, serine,
cysteine, arginine, valine, glutamic acid, or aspartic acid [see Fig. 1(a)]. For all the
peptides, the N-termini are assumed protonated while the C-termini are assumed deprotonated.
The tropocollagen models we use are truncated at 30 amino acids per chain to reduce
computational costs. This leads to short tropocollagen segments with a length of … nm. For
comparison, peptides of comparable length have been used in both computational and
experimental studies.

Because of the short length (the molecules considered here are shorter than the persistence
length of tropocollagen, ~10-20 nm), entropic elasticity is a minor aspect and its effects are
not significant in the present model. This approach enables us to focus on energetic elastic
effects and their change under OI mutations.

Model  equilibration 

Molecular dynamics simulations are performed using the GROMACS code and the GROMOS96 43a1 force
field, as used in earlier studies, which also includes parameters for the hydroxyproline (HYP)
residue. The protein molecules are entirely solvated in a 15 nm × 3 nm × 3 nm periodic water
box (which ensures a minimum distance of 0.8 nm between the protein and the box edge). The
single point charge (SPC) water model is used for the solvent, leading to a total of ≈12,800
atoms for each system. The SETTLE (for water) and LINCS algorithms are used to constrain
covalent bond lengths involving hydrogen atoms, thus allowing a time step of 2 fs. Nonbonded
interactions are computed using a neighbor-list cutoff of 1 nm, with a switching function
between 0.8 and 0.9 nm for van der Waals interactions, while the particle-mesh Ewald (PME)
method is applied to describe electrostatic interactions. In the case of charged peptides,
counterions (Cl⁻ or Na⁺) are added to keep the system neutral. The preliminary energy
minimization of the system is performed using a steepest descent algorithm until convergence or
for a maximum of 10,000 steps. The system is then equilibrated at a temperature of 310 K
(37 °C) for 1200 ps of molecular dynamics. The entire protein is held fixed for the first 200
ps by restraining the atomic positions, and thereafter only the first and the last Cα atoms of
each chain are restrained for the following 1000 ps. We observe that, even in the presence of
large mutant amino acids, the root-mean-square deviation of the proteins reaches a stable value
within the 1-ns simulation time. Thus, we assume that the peptides are equilibrated properly.
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SMD  approach 

Four different configurations are extracted during the
last 300 ps of equilibration for each peptide and are
then used as independent starting points for subsequent
SMD simulations. This leads to a total of 32 cases
(i.e., four replicas of eight peptides). In each case,
the center of mass of the three N-terminal Cα atoms is
kept fixed by means of a strong harmonic restraint with
a spring constant of 3 × 10^ kJ mol⁻¹ nm⁻², while the
center of mass of the three C-terminal Cα atoms (the
pulled group of atoms) is linked to a spring with an
elastic constant k_spring = 4000 kJ mol⁻¹ nm⁻², which is
moved along the direction of the molecular axis with a
velocity of 0.1 m s⁻¹. Each SMD simulation covers a time
span of 70 ns, leading to a total simulation time of
more than 2 μs. The choice of the pulled and reference
groups is arbitrary, but since the models are symmetric,
we do not expect a different behavior if the N-terminus
were pulled instead of the C-terminus. The pulling rate
is chosen based on previous results, in which we showed
that the mechanical properties of tropocollagen
molecules are rate dependent and converge to a finite
value for pulling velocities <1 m s⁻¹ (see Ref. 10 and
the discussion in the section “Rate dependence in
molecular simulation studies”).
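The bookkeeping behind these numbers can be checked with a few lines of arithmetic. The sketch below uses only the values quoted in the text (eight peptides, four replicas, 70 ns per run, 0.1 m s⁻¹); the function and constant names are illustrative and not part of the original protocol.

```python
# Bookkeeping for the SMD runs, using the values quoted in the text.
N_PEPTIDES = 8
N_REPLICAS = 4
RUN_LENGTH_NS = 70.0
PULL_VELOCITY_M_PER_S = 0.1


def total_simulation_time_us(n_peptides, n_replicas, run_length_ns):
    """Aggregate SMD simulation time in microseconds."""
    return n_peptides * n_replicas * run_length_ns / 1000.0


def spring_travel_nm(velocity_m_per_s, run_length_ns):
    """Distance covered by the moving spring anchor during one run.

    1 m/s equals 1 nm/ns, so velocity (m/s) times time (ns) is already in nm.
    """
    return velocity_m_per_s * run_length_ns


n_cases = N_PEPTIDES * N_REPLICAS
total_us = total_simulation_time_us(N_PEPTIDES, N_REPLICAS, RUN_LENGTH_NS)
travel_nm = spring_travel_nm(PULL_VELOCITY_M_PER_S, RUN_LENGTH_NS)
print(f"{n_cases} cases, {total_us:.2f} us in total, "
      f"{travel_nm:.1f} nm of spring travel per run")
```

The aggregate time is consistent with the "more than 2 μs" stated above, and the spring anchor moves by roughly 7 nm in each run.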

All  molecular  dynamics  simulations  are  carried 
out  in  an  NPT  ensemble,  with  the  systems  coupled  to 
a  heat  bath  at  310  K  (coupling  constant  of  0.1  ps  and 
Berendsen thermostat) and to a hydrostatic bath at 1
atm (coupling constant of 0.5 ps and Berendsen barostat).
The force applied to the tropocollagen molecule
by  the  virtual  spring  is 

\[ F(t) = k_{\mathrm{spring}} \left( x_{\mathrm{spring}}(t) - x_{\mathrm{pull}}(t) \right) \tag{2} \]

where x_spring and x_pull represent the spring’s and the
pulled group’s positions, respectively.
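As an illustration of Eq. (2), the following sketch evaluates the spring force for a given anchor and pulled-group position. The spring constant and pulling velocity are the values quoted in the text; the function names, the example positions, and the conversion to piconewtons are assumptions added for illustration only.

```python
# Minimal sketch of the virtual-spring force of Eq. (2).
K_SPRING = 4000.0                 # kJ mol^-1 nm^-2 (value quoted in the text)
PULL_VELOCITY_NM_PER_PS = 1.0e-4  # 0.1 m/s = 0.1 nm/ns = 1e-4 nm/ps
KJ_MOL_NM_TO_PN = 1.66054         # 1 kJ mol^-1 nm^-1 expressed in pN


def spring_position_nm(t_ps, x0_nm=0.0, velocity_nm_per_ps=PULL_VELOCITY_NM_PER_PS):
    """Position of the spring anchor moving at constant velocity."""
    return x0_nm + velocity_nm_per_ps * t_ps


def smd_force(x_spring_nm, x_pull_nm, k_spring=K_SPRING):
    """Force of Eq. (2), in kJ mol^-1 nm^-1."""
    return k_spring * (x_spring_nm - x_pull_nm)


# Hypothetical snapshot: after 1000 ps the anchor has moved 0.1 nm,
# while the pulled group is assumed to have followed by 0.08 nm.
x_spring = spring_position_nm(1000.0)
force = smd_force(x_spring, 0.08)
print(f"F = {force:.1f} kJ/mol/nm = {force * KJ_MOL_NM_TO_PN:.1f} pN")
```

The force grows with the lag between the anchor and the pulled group, so a stiffer molecule, which follows the anchor less readily, produces a larger force at the same anchor displacement.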


Definition of elastic properties: Young’s
modulus, stress, and strain

Young’s modulus relates stress and strain and is a
commonly used engineering measure of the stiffness of a
material. The molecular elastic spring constant of
tropocollagen, k_TC, is calculated by fitting the force
F versus ΔL relationship (where L is the tropocollagen
molecular length) and taking the value of its
derivative. Since the stress-strain relationship is
nonlinear, there are different possible choices of how
to calculate k_TC and thus Young’s modulus. We chose to
consistently consider the value at 8% applied strain.
For strains smaller than 8%, the tropocollagen molecules
are crimped, so they are not under tension (the
molecules are slightly bent in the initial
configuration). The result thus represents Young’s
modulus at small strains.


The  Young’s  modulus  Y  is  calculated  based  on 


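\[ Y = \frac{\sigma}{\varepsilon} = \frac{F/A}{\Delta L/L_0} = \frac{F L_0}{A\,\Delta L} = \frac{k_{\mathrm{TC}} L_0}{A} \tag{3} \]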


The  strain  is  given  by 


\[ \varepsilon = \frac{L - L_0}{L_0} \tag{4} \]


where ε is the engineering strain in the measured
direction. In Eqs. (3) and (4), L and L₀ are the current
and the initial tropocollagen length, respectively; F is
the force inducing the molecule to elongate by ΔL = L
− L₀. The parameter A denotes the cross-sectional
area of the tropocollagen molecule, obtained from the
ratio between the molecular volume and L₀, by assuming
a cylindrical shape. Note that the molecular stress
is defined as σ = F/A.
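The definitions above can be turned into a short calculation. The sketch below estimates the molecular spring constant as the slope of a local least-squares line through the force-extension points near 8% strain, then converts it to a Young’s modulus using A = V/L₀. The local linear fit and its window width are assumptions that stand in for the fitting procedure, which is not specified in detail here; the force curve, length, and volume in the example are synthetic placeholders, not simulation results.

```python
def tangent_stiffness(delta_l_nm, force_pn, l0_nm, strain=0.08, half_window=0.02):
    """Slope dF/d(Delta L) near the target strain, in pN/nm.

    Uses an ordinary least-squares line through the points whose strain
    lies within +/- half_window of the target (an assumed procedure).
    """
    pts = [(dl, f) for dl, f in zip(delta_l_nm, force_pn)
           if abs(dl / l0_nm - strain) <= half_window]
    if len(pts) < 2:
        raise ValueError("need at least two points inside the strain window")
    mean_dl = sum(dl for dl, _ in pts) / len(pts)
    mean_f = sum(f for _, f in pts) / len(pts)
    num = sum((dl - mean_dl) * (f - mean_f) for dl, f in pts)
    den = sum((dl - mean_dl) ** 2 for dl, _ in pts)
    return num / den


def youngs_modulus_gpa(k_tc_pn_per_nm, l0_nm, volume_nm3):
    """Y = k_TC * L0 / A with A = V / L0 (cylindrical shape assumed).

    pN/nm^2 equals MPa, so dividing by 1000 gives GPa.
    """
    area_nm2 = volume_nm3 / l0_nm
    return k_tc_pn_per_nm * l0_nm / area_nm2 / 1000.0


# Synthetic, purely illustrative force-extension curve (not simulation data).
l0 = 8.0       # nm, hypothetical initial length
volume = 8.0   # nm^3, hypothetical molecular volume
extension = [0.01 * i for i in range(101)]             # 0 to 1 nm
force = [500.0 * dl + 400.0 * dl ** 2 for dl in extension]

k_tc = tangent_stiffness(extension, force, l0)
print(f"k_TC = {k_tc:.1f} pN/nm, Y = {youngs_modulus_gpa(k_tc, l0, volume):.2f} GPa")
```

Because the curve is nonlinear, the result depends on the strain at which the slope is taken, which is why a single reference strain is used consistently for all peptides.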

In order to estimate the statistical accuracy of the
determined elastic moduli, we perform four simulations
with different initial configurations for each case
studied (i.e., for each of the eight peptides) and
determine the average values and standard deviations.
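A minimal sketch of this averaging step is given below. The text does not state whether the sample or the population standard deviation is used, so the sample form (n − 1 in the denominator) is an assumption, and the four moduli in the example are hypothetical placeholders rather than reported values.

```python
from statistics import mean, stdev


def summarize_replicas(moduli_gpa):
    """Return (mean, sample standard deviation) over the replicas of one peptide."""
    if len(moduli_gpa) < 2:
        raise ValueError("at least two replicas are needed for a standard deviation")
    return mean(moduli_gpa), stdev(moduli_gpa)


# Hypothetical placeholder values for the four replicas of one peptide.
replica_moduli_gpa = [4.0, 4.4, 3.8, 4.2]
avg, sd = summarize_replicas(replica_moduli_gpa)
print(f"Y = {avg:.2f} +/- {sd:.2f} GPa (n = {len(replica_moduli_gpa)})")
```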


Empirical relationship between biochemical
and mechanical properties

We express the molecule’s stiffness as a function of
two main biochemical properties of the mutant amino
acids: the residue volume and the hydropathy index.
This is achieved with an empirical relation with five
parameters [see Eq. (1)]. We fit the parameters a, b, c,
d, and e in Eq. (1) based on the results for six
peptides (Gly, Ala, Ser, Cys, Arg, Val), for a total of
24 independent data points. This then enables us to
predict the Young’s modulus for the remaining two
peptides (Glu and Asp), which are not included in the
initial fitting. We use five parameters, since this is
the minimum number needed to capture the quadratic
trend relative to both side-chain volume and hydropathy
index. Indeed, using fewer parameters leads to greater
errors in the least-squares analysis and poorer
prediction of the Young’s moduli of the peptides not
included in the fitting.


Rate dependence in molecular
simulation studies

Molecular dynamics simulations are typically carried
out at very high deformation rates. We have therefore
taken special care in the interpretation of the results
and ensured that the measured elastic properties do
not depend on the deformation speed. We previously
studied the mechanical behavior of tropocollagen in
terms of its Young’s modulus under a systematic
variation of loading rates.¹⁰ It was found that for
pulling rates lower than 1 m s⁻¹ the Young’s modulus of
single tropocollagen molecules is independent of the
loading rate. All simulations reported in this paper are
thus carried out at a loading rate of 0.1 m s⁻¹ and
therefore in the regime in which the Young’s modulus
has converged.